{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Custom output parsers\n",
    "\n",
    "If there is a custom format you want to transform a model's output into, you can subclass and create your own output parser.\n",
    "\n",
    "The simplest kind of output parser extends the [`BaseOutputParser<T>` class](https://api.js.langchain.com/classes/langchain_core_output_parsers.BaseOutputParser.html) and must implement the following methods:\n",
    "\n",
    "- `parse`, which takes extracted string output from the model and returns an instance of `T`.\n",
    "- `getFormatInstructions`, which returns formatting instructions to pass to the model's prompt to encourage output in the correct format.\n",
    "\n",
    "The `parse` method should also throw a special type of error called an [`OutputParserException`](https://api.js.langchain.com/classes/langchain_core_output_parsers.OutputParserException.html) if the LLM output is badly formatted, which will trigger special retry behavior in other modules.\n",
    "\n",
    "Here is a simplified example that expects the LLM to output a JSON object with specific named properties:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import {\n",
    "  BaseOutputParser,\n",
    "  OutputParserException,\n",
    "} from \"@langchain/core/output_parsers\";\n",
    "\n",
    "export interface CustomOutputParserFields {}\n",
    "\n",
    "// This can be more generic, like Record<string, string>\n",
    "export type ExpectedOutput = {\n",
    "  greeting: string;\n",
    "}\n",
    "\n",
    "export class CustomOutputParser extends BaseOutputParser<ExpectedOutput> {\n",
    "  lc_namespace = [\"langchain\", \"output_parsers\"];\n",
    "\n",
    "  constructor(fields?: CustomOutputParserFields) {\n",
    "    super(fields);\n",
    "  }\n",
    "\n",
    "  async parse(llmOutput: string): Promise<ExpectedOutput> {\n",
    "    let parsedText;\n",
    "    try {\n",
    "      parsedText = JSON.parse(llmOutput);\n",
    "    } catch (e) {\n",
    "      throw new OutputParserException(\n",
    "        `Failed to parse. Text: \"${llmOutput}\". Error: ${e.message}`,\n",
    "      );\n",
    "    }\n",
    "    if (parsedText.greeting === undefined) {\n",
    "      throw new OutputParserException(\n",
    "        `Failed to parse. Text: \"${llmOutput}\". Error: Missing \"greeting\" key.`,\n",
    "      );\n",
    "    }\n",
    "    if (Object.keys(parsedText).length !== 1) {\n",
    "      throw new OutputParserException(\n",
    "        `Failed to parse. Text: \"${llmOutput}\". Error: Expected one and only one key named \"greeting\".`,\n",
    "      );\n",
    "    }\n",
    "    return parsedText;\n",
    "  }\n",
    "\n",
    "  getFormatInstructions(): string {\n",
    "    return `Your response must be a JSON object with a single key called \"greeting\" with a single string value. Do not return anything else.`;\n",
    "  }\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Then, we can use it with an LLM like this:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "object\n",
      "{\n",
      "  greeting: \"I am an AI assistant programmed to provide information and assist with tasks. How can I help you tod\"... 3 more characters\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "import { ChatPromptTemplate } from \"@langchain/core/prompts\";\n",
    "import { ChatOpenAI } from \"@langchain/openai\";\n",
    "\n",
    "const template = `Answer the following user question to the best of your ability:\n",
    "{format_instructions}\n",
    "\n",
    "{question}`;\n",
    "\n",
    "const prompt = ChatPromptTemplate.fromTemplate(template);\n",
    "\n",
    "const model = new ChatOpenAI({});\n",
    "\n",
    "const outputParser = new CustomOutputParser();\n",
    "\n",
    "const chain = prompt.pipe(model).pipe(outputParser);\n",
    "\n",
    "const result = await chain.invoke({\n",
    "  question: \"how are you?\",\n",
    "  format_instructions: outputParser.getFormatInstructions(),\n",
    "});\n",
    "\n",
    "console.log(typeof result);\n",
    "console.log(result);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Parsing raw model outputs\n",
    "\n",
    "Sometimes there is additional metadata on the model output that is important besides the raw text. One example of this is function calling, where arguments intended to be passed to called functions are returned in a separate property. If you need this finer-grained control, you can instead subclass the [`BaseLLMOutputParser<T>` class](https://api.js.langchain.com/classes/langchain_core_output_parsers.BaseLLMOutputParser.html). This class requires a single method:\n",
    "\n",
    "- `parseResult`, which takes a [`Generation[]`](https://api.js.langchain.com/interfaces/langchain_core_outputs.Generation.html) or a [`ChatGeneration[]`](https://api.js.langchain.com/interfaces/langchain_core_outputs.ChatGeneration.html) as a parameter. This is because output parsers generally work with both chat models and LLMs, and therefore must be able to handle both types of outputs.\n",
    "\n",
    "The `getFormatInstructions` method is not required for this class. Here's an example of the above output parser rewritten in this style:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "import {\n",
    "  BaseLLMOutputParser,\n",
    "  OutputParserException,\n",
    "} from \"@langchain/core/output_parsers\";\n",
    "import { ChatGeneration, Generation } from \"@langchain/core/outputs\";\n",
    "\n",
    "export interface CustomOutputParserFields {}\n",
    "\n",
    "// This can be more generic, like Record<string, string>\n",
    "export type ExpectedOutput = {\n",
    "  greeting: string;\n",
    "}\n",
    "\n",
    "function isChatGeneration(llmOutput: ChatGeneration | Generation): llmOutput is ChatGeneration {\n",
    "  return \"message\" in llmOutput;\n",
    "}\n",
    "\n",
    "export class CustomLLMOutputParser extends BaseLLMOutputParser<ExpectedOutput> {\n",
    "  lc_namespace = [\"langchain\", \"output_parsers\"];\n",
    "\n",
    "  constructor(fields?: CustomOutputParserFields) {\n",
    "    super(fields);\n",
    "  }\n",
    "\n",
    "  async parseResult(llmOutputs: ChatGeneration[] | Generation[]): Promise<ExpectedOutput> {\n",
    "    if (!llmOutputs.length) {\n",
    "      throw new OutputParserException(\"Output parser did not receive any generations.\");\n",
    "    }\n",
    "    let parsedOutput;\n",
    "    // There is a standard `text` property as well on both types of Generation\n",
    "    if (isChatGeneration(llmOutputs[0])) {\n",
    "      parsedOutput = llmOutputs[0].message.content;\n",
    "    } else {\n",
    "      parsedOutput = llmOutputs[0].text;\n",
    "    }\n",
    "    let parsedText;\n",
    "    try {\n",
    "      parsedText = JSON.parse(parsedOutput);\n",
    "    } catch (e) {\n",
    "      throw new OutputParserException(\n",
    "        `Failed to parse. Text: \"${parsedOutput}\". Error: ${e.message}`,\n",
    "      );\n",
    "    }\n",
    "    if (parsedText.greeting === undefined) {\n",
    "      throw new OutputParserException(\n",
    "        `Failed to parse. Text: \"${parsedOutput}\". Error: Missing \"greeting\" key.`,\n",
    "      );\n",
    "    }\n",
    "    if (Object.keys(parsedText).length !== 1) {\n",
    "      throw new OutputParserException(\n",
    "        `Failed to parse. Text: \"${parsedOutput}\". Error: Expected one and only one key named \"greeting\".`,\n",
    "      );\n",
    "    }\n",
    "    return parsedText;\n",
    "  }\n",
    "}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "object\n",
      "{\n",
      "  greeting: \"I'm an AI assistant, I don't have feelings but thank you for asking!\"\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "const template = `Answer the following user question to the best of your ability:\n",
    "Your response must be a JSON object with a single key called \"greeting\" with a single string value. Do not return anything else.\n",
    "\n",
    "{question}`;\n",
    "\n",
    "const prompt = ChatPromptTemplate.fromTemplate(template);\n",
    "\n",
    "const model = new ChatOpenAI({});\n",
    "\n",
    "const outputParser = new CustomLLMOutputParser();\n",
    "\n",
    "const chain = prompt.pipe(model).pipe(outputParser);\n",
    "\n",
    "const result = await chain.invoke({\n",
    "  question: \"how are you?\",\n",
    "});\n",
    "\n",
    "console.log(typeof result);\n",
    "console.log(result);"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Streaming\n",
    "\n",
    "The above parser will work well for parsing fully aggregated model outputs, but will cause `.stream()` to return a single chunk rather than emitting them as the model generates them:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  greeting: \"I'm an AI assistant, so I don't feel emotions but I'm here to help you.\"\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "const stream = await chain.stream({\n",
    "  question: \"how are you?\",\n",
    "});\n",
    "for await (const chunk of stream) {\n",
    "  console.log(chunk);\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This makes sense in some scenarios where we need to wait for the LLM to finish generating before parsing the output, but supporting preemptive parsing when possible creates nicer downstream user experiences. A simple example is automatically transforming streamed output into bytes as it is generated for use in HTTP responses.\n",
    "\n",
    "The base class in this case is [`BaseTransformOutputParser`](https://api.js.langchain.com/classes/langchain_core_output_parsers.BaseTransformOutputParser.html), which itself extends `BaseOutputParser`. As before, you'll need to implement the `parse` method, but this time it's a bit trickier since each `parse` invocation needs to potentially handle a chunk of output rather than the whole thing. Here's a simple example:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "import { BaseTransformOutputParser } from \"@langchain/core/output_parsers\";\n",
    "\n",
    "export class CustomTransformOutputParser extends BaseTransformOutputParser<Uint8Array> {\n",
    "  lc_namespace = [\"langchain\", \"output_parsers\"];\n",
    "\n",
    "  protected textEncoder = new TextEncoder();\n",
    "\n",
    "  async parse(text: string): Promise<Uint8Array> {\n",
    "    return this.textEncoder.encode(text);\n",
    "  }\n",
    "\n",
    "  getFormatInstructions(): string {\n",
    "    return \"\";\n",
    "  }\n",
    "}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Uint8Array(0) []\n",
      "Uint8Array(2) [ 65, 115 ]\n",
      "Uint8Array(3) [ 32, 97, 110 ]\n",
      "Uint8Array(3) [ 32, 65, 73 ]\n",
      "Uint8Array(1) [ 44 ]\n",
      "Uint8Array(2) [ 32, 73 ]\n",
      "Uint8Array(4) [ 32, 100, 111, 110 ]\n",
      "Uint8Array(2) [ 39, 116 ]\n",
      "Uint8Array(5) [ 32, 104, 97, 118, 101 ]\n",
      "Uint8Array(9) [\n",
      "   32, 102, 101, 101,\n",
      "  108, 105, 110, 103,\n",
      "  115\n",
      "]\n",
      "Uint8Array(3) [ 32, 111, 114 ]\n",
      "Uint8Array(9) [\n",
      "   32, 101, 109, 111,\n",
      "  116, 105, 111, 110,\n",
      "  115\n",
      "]\n",
      "Uint8Array(1) [ 44 ]\n",
      "Uint8Array(3) [ 32, 115, 111 ]\n",
      "Uint8Array(2) [ 32, 73 ]\n",
      "Uint8Array(4) [ 32, 100, 111, 110 ]\n",
      "Uint8Array(2) [ 39, 116 ]\n",
      "Uint8Array(11) [\n",
      "   32, 101, 120, 112,\n",
      "  101, 114, 105, 101,\n",
      "  110,  99, 101\n",
      "]\n",
      "Uint8Array(4) [ 32, 116, 104, 101 ]\n",
      "Uint8Array(5) [ 32, 115, 97, 109, 101 ]\n",
      "Uint8Array(4) [ 32, 119, 97, 121 ]\n",
      "Uint8Array(7) [\n",
      "   32, 104, 117,\n",
      "  109,  97, 110,\n",
      "  115\n",
      "]\n",
      "Uint8Array(3) [ 32, 100, 111 ]\n",
      "Uint8Array(1) [ 46 ]\n",
      "Uint8Array(8) [\n",
      "   32,  72, 111, 119,\n",
      "  101, 118, 101, 114\n",
      "]\n",
      "Uint8Array(1) [ 44 ]\n",
      "Uint8Array(2) [ 32, 73 ]\n",
      "Uint8Array(2) [ 39, 109 ]\n",
      "Uint8Array(5) [ 32, 104, 101, 114, 101 ]\n",
      "Uint8Array(3) [ 32, 116, 111 ]\n",
      "Uint8Array(5) [ 32, 104, 101, 108, 112 ]\n",
      "Uint8Array(4) [ 32, 121, 111, 117 ]\n",
      "Uint8Array(5) [ 32, 119, 105, 116, 104 ]\n",
      "Uint8Array(4) [ 32, 97, 110, 121 ]\n",
      "Uint8Array(10) [\n",
      "   32, 113, 117, 101,\n",
      "  115, 116, 105, 111,\n",
      "  110, 115\n",
      "]\n",
      "Uint8Array(3) [ 32, 111, 114 ]\n",
      "Uint8Array(6) [ 32, 116, 97, 115, 107, 115 ]\n",
      "Uint8Array(4) [ 32, 121, 111, 117 ]\n",
      "Uint8Array(5) [ 32, 104, 97, 118, 101 ]\n",
      "Uint8Array(1) [ 33 ]\n",
      "Uint8Array(4) [ 32, 72, 111, 119 ]\n",
      "Uint8Array(4) [ 32, 99, 97, 110 ]\n",
      "Uint8Array(2) [ 32, 73 ]\n",
      "Uint8Array(7) [\n",
      "   32,  97, 115,\n",
      "  115, 105, 115,\n",
      "  116\n",
      "]\n",
      "Uint8Array(4) [ 32, 121, 111, 117 ]\n",
      "Uint8Array(6) [ 32, 116, 111, 100, 97, 121 ]\n",
      "Uint8Array(1) [ 63 ]\n",
      "Uint8Array(0) []\n"
     ]
    }
   ],
   "source": [
    "const template = `Answer the following user question to the best of your ability:\n",
    "\n",
    "{question}`;\n",
    "\n",
    "const prompt = ChatPromptTemplate.fromTemplate(template);\n",
    "\n",
    "const model = new ChatOpenAI({});\n",
    "\n",
    "const outputParser = new CustomTransformOutputParser();\n",
    "\n",
    "const chain = prompt.pipe(model).pipe(outputParser);\n",
    "\n",
    "const stream = await chain.stream({\n",
    "  question: \"how are you?\",\n",
    "});\n",
    "\n",
    "for await (const chunk of stream) {\n",
    "  console.log(chunk);\n",
    "}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For more examples, see some of the implementations [in @langchain/core](https://github.com/langchain-ai/langchainjs/tree/main/langchain-core/src/output_parsers)."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Deno",
   "language": "typescript",
   "name": "deno"
  },
  "language_info": {
   "file_extension": ".ts",
   "mimetype": "text/x.typescript",
   "name": "typescript",
   "nb_converter": "script",
   "pygments_lexer": "typescript",
   "version": "5.3.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
